LISTING OF CLAIMS 

1-9. (canceled) 

10. (currently amended) A computer-implemented method for applying XML-compatible markup to 
unstructured textual documents, the method comprising: 

defining an XML schema in accordance with which documents are to be marked up; 

opening a target document in a host Application Programming Interface (API) enabled 
wordprocessor application capable of storing XML-compatible non-native markup in 
documents; 

using an API of the host wordprocessor application to parse content included in 
the target document and to perform element pattern matching to yield inferred XML structure in 
accordance with the defined XML schema; and 

storing the inferred XML structure within the target document as XML-compatible markup via 
the API of the wordprocessor application. 

11. (currently amended) A method as recited in claim 10 wherein said using step comprises 
recognizing instances of designated baseline 
elements via pattern search and matching; and inferring and constructing higher-level element structure in 
substantial conformance with the defined XML schema. 

12. (currently amended) A method as recited in claim 10 wherein the original visual formatting and 
textual content of the target document remain intact after storing the inferred 
XML structure within the target document as XML-compatible markup. 

13. (currently amended) A method as recited in claim 10 further comprising limiting the 
XML structure inference and markup creation to a select range or number of select ranges of 
the target document. 

14. (currently amended) A method as recited in claim 10 further comprising creating a 
structure inference definition for the defined XML schema by means of a 
dedicated Graphical User Interface (GUI) integrated in a GUI workspace of the host wordprocessor 
application. 

15. (currently amended) A method as recited in claim 10 further comprising presenting a user 
with a GUI to review probable trouble spots in the target document and to manually correct and 
complete the automatically generated XML-compatible markup, the probable trouble spots comprising 
unmarked ranges, missing required elements from the defined XML schema, and inferred XML structure 
being invalid according to the defined XML schema. 
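
As an illustration of the trouble spots named in claim 15, the following minimal Python sketch finds unmarked ranges and missing required elements. The function names, the half-open (start, end) character offsets, and the flat list of required element names are assumptions made for this example only; they are not taken from the claims, and validation against the schema is not shown.

```python
def unmarked_ranges(doc_length, marked):
    """Return the gaps [start, end) of the document not covered by any marked range.

    `marked` is a list of (start, end) offsets; ranges may overlap (assumption).
    """
    gaps, cur = [], 0
    for start, end in sorted(marked):
        if start > cur:
            gaps.append((cur, start))  # text between two marked ranges
        cur = max(cur, end)
    if cur < doc_length:
        gaps.append((cur, doc_length))  # unmarked tail of the document
    return gaps


def missing_required(required, found):
    """Return the required element names that were never inferred, sorted."""
    return sorted(set(required) - set(found))


if __name__ == "__main__":
    print(unmarked_ranges(100, [(10, 20), (15, 40), (60, 100)]))
    print(missing_required({"title", "party"}, ["title", "para"]))
```

A review GUI could list each returned gap and missing element as one entry for the user to inspect.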

16. (canceled) 

17. (new) A method as recited in claim 10 wherein opening a target document in a host Application 
Programming Interface (API) enabled wordprocessor application capable of storing XML-compatible 
non-native markup in documents includes opening the target document in a host API enabled 
wordprocessor application that includes a plug-in capable of storing XML-compatible markup. 

18. (new) A method as recited in claim 10 further comprising: 

identifying a target document type from a set of textual documents with generally consistent 
inherent logical structure and formatting; 

creating a structure inference definition for the target document type comprising a multiplicity of 
definitions of baseline elements, the baseline elements being select leaf-level or near-leaf-level elements 
from the target document type and having a schema context; and 

defining recognition patterns for the baseline elements. 

19. (new) A method as recited in claim 18 further comprising invoking a computer-executable engine 
to apply the structure inference definition to one or more instances of the target document type to produce 
XML structure relating to the defined schema, the operation of said engine comprising: parsing the one or 
more instances of the target document type. 

20. (new) A method as recited in claim 19 further comprising defining patterns and structure 
inference and construction rules for one or more levels of nested elements in a designated baseline 
element, and configuring the computer-executable engine to use said patterns and rules to produce nested 
element structure within a text range and the schema context of a baseline element. 

21. (new) A method as recited in claim 19 further comprising: 

deriving a state machine having transition labels by recursive aggregation of schema element 
content models, starting from a designated root element and moving to the level of designated baseline 
elements; 

incorporating identities and specific instances of baseline elements in the transition labels of the 
state machine; and 

configuring the computer-executable engine to compile and use the state machine to consider a 
relatively small number of expected baseline elements at a given document position. 
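
One way to picture the state machine of claim 21 is the sketch below. It assumes a toy schema in which each content model is a nested tuple using "seq", "star", "plus" and "opt" (choice groups and recursive schemas are left out), and it swaps in the standard Glushkov position construction as a stand-in for the derivation the claim describes. All names and the schema itself are hypothetical.

```python
def expand(schema, model, baseline):
    """Recursively replace non-baseline element names by their content models."""
    if isinstance(model, str):
        if model in baseline:
            return ("leaf", model)
        return expand(schema, schema[model], baseline)
    return (model[0],) + tuple(expand(schema, m, baseline) for m in model[1:])


def glushkov(expr):
    """Return (labels, first, follow) for an expanded content model.

    labels[p] is the baseline element at position p (one specific instance),
    first is the set of positions allowed at the start, and follow[p] is the
    set of positions allowed right after position p.
    """
    labels, follow = [], {}

    def walk(e):
        op = e[0]
        if op == "leaf":
            p = len(labels)
            labels.append(e[1])
            follow[p] = set()
            return False, {p}, {p}
        if op == "seq":
            nullable, first, last = True, set(), set()
            for child in e[1:]:
                n, f, l = walk(child)
                for p in last:
                    follow[p] |= f      # child may follow the prefix
                if nullable:
                    first |= f          # prefix can be empty
                last = (last | l) if n else l
                nullable = nullable and n
            return nullable, first, last
        n, f, l = walk(e[1])            # "star", "plus", "opt" wrap one child
        if op in ("star", "plus"):
            for p in l:
                follow[p] |= f          # loop back for repetition
        return (n or op in ("star", "opt")), f, l

    _, first, _ = walk(expr)
    return labels, first, follow


# Hypothetical schema: a contract is a title followed by one or more sections.
SCHEMA = {
    "contract": ("seq", "title", ("plus", "section")),
    "section": ("seq", "heading", ("star", "para")),
}
BASELINE = {"title", "heading", "para"}

if __name__ == "__main__":
    labels, first, follow = glushkov(expand(SCHEMA, "contract", BASELINE))
    print("at start:", sorted(labels[p] for p in first))
    for p, name in enumerate(labels):
        print("after", name, ":", sorted(labels[q] for q in follow[p]))
```

At each document position the engine would then try only the recognition patterns of the elements in the current follow set, rather than every baseline element in the definition.
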

22. (new) A method as recited in claim 10 wherein opening a target document in a host Application 
Programming Interface (API) enabled wordprocessor application capable of storing XML-compatible 
non-native markup in documents includes detecting the target document in a predefined incoming 
document folder or receiving the target document via the API from an external client component. 

23. (new) A method as recited in claim 22 wherein using an API of the wordprocessor application to 
parse content included in the target document and to perform element pattern matching to yield inferred 
XML structure in accordance with the defined XML schema includes using the API of the wordprocessor 
application to automatically parse the content included in the target document and to perform element 
pattern matching to yield inferred XML structure in accordance with the defined XML schema after 
detecting the target document in a predefined incoming document folder or after receiving the target 
document via the API from the external client component. 

24. (new) A method as recited in claim 22 further comprising creating a structure inference definition 
for the target document comprising a multiplicity of definitions of baseline elements, the baseline 
elements being select leaf-level or near-leaf-level elements from the target document and having a 
schema context and defining recognition patterns for the baseline elements. 

25. (new) A method as recited in claim 10 wherein opening a target document in a host Application 
Programming Interface (API) enabled wordprocessor application capable of storing XML-compatible 
non-native markup in its documents includes opening multiple target documents. 

26. (new) A method as recited in claim 25 wherein using an API of the wordprocessor application to 
parse content included in the target document and to perform element pattern matching to yield inferred 
XML structure in accordance with the defined XML schema includes using the API of the wordprocessor 
application to parse content included in the multiple target documents sequentially or in parallel in an 
unattended batch mode. 

27. (new) A method as recited in claim 25 further comprising creating a structure inference definition 
for the multiple target documents comprising a multiplicity of definitions of baseline elements, the 
baseline elements being select leaf-level or near-leaf-level elements from the multiple documents and 
having a schema context and defining recognition patterns for the baseline elements. 

28. (new) A method as recited in claim 18 wherein creating a structure inference definition for the 
target document type comprising a multiplicity of definitions of baseline elements, the baseline elements 
being select leaf-level or near-leaf-level elements from the target document type and having a schema 
context includes identifying a baseline element by a schema path comprising a sequence of one or more 
XML element or element type steps, a first one of the one or more XML element or element type steps 
designating a global schema element or type and each subsequent step designating a child element or 
element group of its predecessor. 
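
The schema path of claim 28 can be sketched as follows. The "/" separator, the nested-dictionary schema, and the function names are assumptions chosen for illustration; the claim does not prescribe a syntax.

```python
def parse_schema_path(path):
    """Split a path such as 'contract/section/title' into its steps."""
    steps = [s for s in path.split("/") if s]
    if not steps:
        raise ValueError("a schema path needs at least one step")
    return steps


def resolve_schema_path(schema, path):
    """Return True if the path is valid in the toy schema.

    The first step must name a global element (a top-level key); every later
    step must name a child of its predecessor.
    """
    node = schema
    for step in parse_schema_path(path):
        if step not in node:
            return False
        node = node[step]
    return True


# Hypothetical schema: global elements map to their children.
SCHEMA = {"contract": {"section": {"title": {}, "para": {}}}, "party": {}}

if __name__ == "__main__":
    print(resolve_schema_path(SCHEMA, "contract/section/title"))
    print(resolve_schema_path(SCHEMA, "section/title"))
```

In this sketch "section/title" is rejected because "section" is not a global element, which mirrors the requirement that the first step designate a global schema element or type.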

29. (new) A method as recited in claim 18 further comprising defining the recognition patterns for the 
baseline elements to include: text patterns selected from the group of literals, wildcards, and regular 
expressions; formatting patterns selected from the group of font style, font name, font size, composite 
style name, paragraph indentation, and outline level; and logical compositions of atomic text and 
formatting patterns and pattern groups. 
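
The logical composition of text and formatting patterns in claim 29 could look like the sketch below. The `Run` class and every predicate name are hypothetical; a real implementation would read text and formatting through the host wordprocessor API, which is not modeled here.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Run:
    """Hypothetical stand-in for a document range and its formatting."""
    text: str
    bold: bool = False
    font_size: float = 11.0
    style: str = "Normal"


# Atomic text patterns: literal, wildcard, regular expression.
def literal(s):
    return lambda r: r.text == s

def wildcard(p):
    return lambda r: fnmatchcase(r.text, p)

def regex(p):
    compiled = re.compile(p)
    return lambda r: compiled.fullmatch(r.text) is not None

# Atomic formatting patterns.
def is_bold():
    return lambda r: r.bold

def min_font_size(n):
    return lambda r: r.font_size >= n

def style_named(name):
    return lambda r: r.style == name

# Logical compositions of patterns and pattern groups.
def all_of(*patterns):
    return lambda r: all(p(r) for p in patterns)

def any_of(*patterns):
    return lambda r: any(p(r) for p in patterns)

def negate(pattern):
    return lambda r: not pattern(r)


# A numbered heading: "<digits>. <text>" that is bold or styled "Heading 1".
heading = all_of(regex(r"\d+\.\s+.+"), any_of(is_bold(), style_named("Heading 1")))

if __name__ == "__main__":
    print(heading(Run("3. Scope", bold=True)))
    print(heading(Run("3. Scope")))
```

Representing each pattern as a predicate keeps atomic patterns and groups interchangeable, so a group can be nested inside another group without special handling.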

30. (new) A method as recited in claim 18 further comprising defining the recognition patterns for the 
baseline elements to include: 

an optional leading pattern, intended to match a document range immediately preceding a content 
range of the baseline element, allowing intervening whitespace; 

an optional content pattern, intended to match the content range of the baseline element; and 

an optional trailing pattern, intended to match a document range immediately following the 
content range for the baseline element, allowing intervening whitespace, an end document position of the 
trailing pattern element serving as a starting position for matching recognition patterns of following 
baseline elements. 
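
The leading, content, and trailing patterns of claim 30 can be sketched over plain text with regular expressions. The `BaselinePattern` class, the `match_baseline` function, and the choice to treat the rest of the line as content when no content pattern is given are assumptions of this example, not part of the claim.

```python
import re
from dataclasses import dataclass
from typing import Optional

WHITESPACE = re.compile(r"\s*")


@dataclass
class BaselinePattern:
    """Hypothetical holder for the three optional patterns of one baseline element."""
    name: str
    leading: Optional[str] = None   # matches just before the content range
    content: Optional[str] = None   # matches the content range itself
    trailing: Optional[str] = None  # matches just after the content range


def match_baseline(p, text, pos):
    """Match p at text[pos:]; return (content_start, content_end, next_pos) or None."""
    cur = pos
    if p.leading is not None:
        m = re.compile(p.leading).match(text, cur)
        if m is None:
            return None
        cur = WHITESPACE.match(text, m.end()).end()  # allow intervening whitespace
    # Assumption: without a content pattern, the rest of the line is the content.
    content_re = p.content if p.content is not None else r"[^\n]*"
    m = re.compile(content_re).match(text, cur)
    if m is None:
        return None
    start, end = m.start(), m.end()
    cur = end
    if p.trailing is not None:
        t = re.compile(p.trailing).match(text, WHITESPACE.match(text, cur).end())
        if t is None:
            return None
        cur = t.end()  # matching of following elements starts here
    return start, end, cur


if __name__ == "__main__":
    title = BaselinePattern("SectionTitle", leading=r"Section\s+\d+\.",
                            content=r"[A-Z][^\n.]*", trailing=r"\.")
    text = "Section 4.  Payment Terms. The buyer shall..."
    start, end, next_pos = match_baseline(title, text, 0)
    print(repr(text[start:end]), next_pos)
```

Only the content range would be wrapped in markup; the leading and trailing ranges serve as anchors, and the returned next position is where patterns for following baseline elements would be tried.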

31. (new) A method as recited in claim 18 wherein the defining of recognition patterns for the 
baseline elements comprises assigning a priority value or pattern weight value which influences a 
selection of one baseline element when the recognition patterns for more than one baseline element yield 
competing/ambiguous matches at a particular document position. 
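
The priority-based selection of claim 31 might be sketched as below. The `Candidate` class, the convention that a higher priority wins, and the tie-breaks (longer match, then element name) are assumptions for illustration; the claim states only that a priority or weight value influences the selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record of one baseline element matched at a document position."""
    element: str
    start: int
    end: int
    priority: int  # assumption: a higher value is preferred


def select_candidate(candidates):
    """Pick one candidate among competing matches at the same position.

    Order of preference: higher priority, then longer match, then element
    name (only so that the result is deterministic).
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c.priority, c.end - c.start, c.element))


if __name__ == "__main__":
    competing = [
        Candidate("Paragraph", 0, 40, priority=1),
        Candidate("Heading", 0, 12, priority=5),
    ]
    print(select_candidate(competing).element)
```

Here the shorter "Heading" match is chosen because its priority outranks the longer "Paragraph" match.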